NSF PAR Search | NSF Public Access Repository

Linear Recursive Feature Machines provably recover low-rank matrices

https://doi.org/10.1073/pnas.2411325122

Radhakrishnan, Adityanarayanan; Belkin, Mikhail; Drusvyatskiy, Dmitriy (April 2025, Proceedings of the National Academy of Sciences)

A fundamental problem in machine learning is to understand how neural networks make accurate predictions, while seemingly bypassing the curse of dimensionality. A possible explanation is that common training algorithms for neural networks implicitly perform dimensionality reduction, a process called feature learning. Recent work [A. Radhakrishnan, D. Beaglehole, P. Pandit, M. Belkin, Science 383, 1461–1467 (2024)] posited that the effects of feature learning can be elicited from a classical statistical estimator called the average gradient outer product (AGOP). The authors proposed Recursive Feature Machines (RFMs) as an algorithm that explicitly performs feature learning by alternating between 1) reweighting the feature vectors by the AGOP and 2) learning the prediction function in the transformed space. In this work, we develop theoretical guarantees for how RFM performs dimensionality reduction by focusing on the class of overparameterized problems arising in sparse linear regression and low-rank matrix recovery. Specifically, we show that RFM restricted to linear models (lin-RFM) reduces to a variant of the well-studied Iteratively Reweighted Least Squares (IRLS) algorithm. Furthermore, our results connect feature learning in neural networks and classical sparse recovery algorithms and shed light on how neural networks recover low rank structure from data. In addition, we provide an implementation of lin-RFM that scales to matrices with millions of missing entries. Our implementation is faster than the standard IRLS algorithms since it avoids forming singular value decompositions. It also outperforms deep linear networks for sparse linear regression and low-rank matrix completion.
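The alternation described in this abstract (fit a predictor, then reweight the features) resembles the classical IRLS loop for sparse linear regression. The sketch below is a textbook-style IRLS iteration, not the authors' lin-RFM implementation: the function name `irls_sparse`, the weight update `sqrt(w**2 + eps**2)`, the smoothing constant `eps`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def irls_sparse(X, y, n_iter=30, eps=1e-3):
    """IRLS sketch for sparse interpolation of X w = y (eps is an assumed smoothing constant)."""
    d = X.shape[1]
    m = np.ones(d)                      # diagonal reweighting matrix M, starts at identity
    history = []
    for _ in range(n_iter):
        # Step 1: weighted minimum-norm interpolant, w = M X^T (X M X^T)^{-1} y
        XM = X * m                      # scales column j of X by m[j], i.e. X @ diag(m)
        w = m * (X.T @ np.linalg.solve(XM @ X.T, y))
        history.append(w)
        # Step 2: update the reweighting from the current solution
        m = np.sqrt(w ** 2 + eps ** 2)
    return w, history

# Hypothetical overparameterized problem: 40 measurements, 100 features, 5 nonzeros
rng = np.random.default_rng(0)
n, d, k = 40, 100, 5
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:k] = rng.normal(size=k)
y = X @ w_true

w_hat, history = irls_sparse(X, y)

# Smoothed l1 objective that this fixed-eps iteration does not increase
smoothed_l1 = lambda w, eps=1e-3: np.sum(np.sqrt(w ** 2 + eps ** 2))
```

Every iterate interpolates the data exactly; the reweighting only changes which interpolant is selected, pushing toward solutions with a smaller smoothed l1 norm.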
Free, publicly-accessible full text available April 1, 2026
Transfer Learning with Kernel Methods

https://doi.org/10.1038/s41467-023-41215-8

Radhakrishnan, Adityanarayanan; Ruiz Luyten, Max; Prasad, Neha; Uhler, Caroline (September 2023, Nature Communications)

Transfer learning refers to the process of adapting a model trained on a source task to a target task. While kernel methods are conceptually and computationally simple models that are competitive on a variety of tasks, it has been unclear how to develop scalable kernel-based transfer learning methods across general source and target tasks with possibly differing label dimensions. In this work, we propose a transfer learning framework for kernel methods by projecting and translating the source model to the target task. We demonstrate the effectiveness of our framework in applications to image classification and virtual drug screening. For both applications, we identify simple scaling laws that characterize the performance of transfer-learned kernels as a function of the number of target examples. We explain this phenomenon in a simplified linear setting, where we are able to derive the exact scaling laws.
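The abstract does not spell out the two operations, so the sketch below is one plausible reading rather than the paper's method: "projection" is taken to mean fitting a kernel regressor on the source model's outputs, and "translation" to mean adding a kernel-fit correction to the source model. The Gaussian kernel, the ridge parameter `reg`, the stand-in `source_model`, and the toy target task are all assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * bandwidth ** 2))

def fit_kernel_ridge(X, Y, reg=1e-3):
    """Kernel ridge regression; returns a prediction function."""
    alpha = np.linalg.solve(gaussian_kernel(X, X) + reg * np.eye(len(X)), Y)
    return lambda Z: gaussian_kernel(Z, X) @ alpha

rng = np.random.default_rng(0)
source_model = lambda X: np.sin(X[:, :1])     # stand-in for a model trained on the source task
X_t = rng.uniform(-3, 3, size=(50, 1))        # hypothetical target inputs
y_t = 2 * np.sin(X_t) + 0.3 * X_t             # hypothetical target labels

# Translation (assumed reading): source model plus a correction fit to its residuals
correction = fit_kernel_ridge(X_t, y_t - source_model(X_t))
translated = lambda X: source_model(X) + correction(X)

# Projection (assumed reading): regress target labels on the source model's outputs
head = fit_kernel_ridge(source_model(X_t), y_t)
projected = lambda X: head(source_model(X))

mse_source = np.mean((source_model(X_t) - y_t) ** 2)
mse_translated = np.mean((translated(X_t) - y_t) ** 2)
```

In this reading, projection can change the label dimension because the new regressor's output size is set by the target labels, while translation keeps the source model's output space and only shifts it.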
Wide and deep neural networks achieve consistency for classification

https://doi.org/10.1073/pnas.2208779120

Radhakrishnan, Adityanarayanan; Belkin, Mikhail; Uhler, Caroline (April 2023, Proceedings of the National Academy of Sciences)

While neural networks are used for classification tasks across domains, a long-standing open problem in machine learning is determining whether neural networks trained using standard procedures are consistent for classification, i.e., whether such models minimize the probability of misclassification for arbitrary data distributions. In this work, we identify and construct an explicit set of neural network classifiers that are consistent. Since effective neural networks in practice are typically both wide and deep, we analyze infinitely wide networks that are also infinitely deep. In particular, using the recent connection between infinitely wide neural networks and neural tangent kernels, we provide explicit activation functions that can be used to construct networks that achieve consistency. Interestingly, these activation functions are simple and easy to implement, yet differ from commonly used activations such as ReLU or sigmoid. More generally, we create a taxonomy of infinitely wide and deep networks and show that these models implement one of three well-known classifiers depending on the activation function used: 1) 1-nearest neighbor (model predictions are given by the label of the nearest training example); 2) majority vote (model predictions are given by the label of the class with the greatest representation in the training set); or 3) singular kernel classifiers (a set of classifiers containing those that achieve consistency). Our results highlight the benefit of using deep networks for classification tasks, in contrast to regression tasks, where excessive depth is harmful.
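The three classifier families named in this taxonomy can be written down directly for binary labels in {-1, +1}. This is a minimal sketch under assumptions: the function names are hypothetical, and the singular kernel `1 / distance**alpha` with `alpha=2.0` is one simple choice of a kernel that blows up at zero distance, not necessarily the form analyzed in the paper.

```python
import numpy as np

def one_nearest_neighbor(X, y, x):
    """Predict the label of the closest training example."""
    return y[np.argmin(np.linalg.norm(X - x, axis=1))]

def majority_vote(y):
    """Predict the most common training label, ignoring the query point."""
    return 1 if np.sum(y) >= 0 else -1

def singular_kernel_classifier(X, y, x, alpha=2.0):
    """Sign of a label average weighted by a kernel that is singular at distance 0."""
    dist = np.linalg.norm(X - x, axis=1)
    if np.any(dist == 0):               # query coincides with a training point
        return y[np.argmin(dist)]
    return 1 if np.sum(y / dist ** alpha) >= 0 else -1

# Hypothetical training set: two positive points near the origin, three negative points far away
X = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
y = np.array([1, 1, -1, -1, -1])
query = np.array([0.2, 0.0])

pred_1nn = one_nearest_neighbor(X, y, query)
pred_majority = majority_vote(y)
pred_singular = singular_kernel_classifier(X, y, query)
```

On this toy set the majority vote disagrees with the other two rules, because it ignores where the query lies and the negative class has more training examples.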
Full Text Available
Simple, fast, and flexible framework for matrix completion with infinite width neural networks

https://doi.org/10.1073/pnas.2115064119

Radhakrishnan, Adityanarayanan; Stefanakis, George; Belkin, Mikhail; Uhler, Caroline (April 2022, Proceedings of the National Academy of Sciences)

Matrix completion problems arise in many applications including recommendation systems, computer vision, and genomics. Increasingly larger neural networks have been successful in many of these applications but at considerable computational costs. Remarkably, taking the width of a neural network to infinity allows for improved computational performance. In this work, we develop an infinite width neural network framework for matrix completion that is simple, fast, and flexible. Simplicity and speed come from the connection between the infinite width limit of neural networks and kernels known as neural tangent kernels (NTK). In particular, we derive the NTK for fully connected and convolutional neural networks for matrix completion. The flexibility stems from a feature prior, which allows encoding relationships between coordinates of the target matrix, akin to semisupervised learning. The effectiveness of our framework is demonstrated through competitive results for virtual drug screening and image inpainting/reconstruction. We also provide an implementation in Python to make our framework accessible on standard hardware to a broad audience.
Full Text Available
Cross-modal autoencoder framework learns holistic representations of cardiovascular state

https://doi.org/10.1038/s41467-023-38125-0

Radhakrishnan, Adityanarayanan; Friedman, Sam F.; Khurshid, Shaan; Ng, Kenney; Batra, Puneet; Lubitz, Steven A.; Philippakis, Anthony A.; Uhler, Caroline (April 2023, Nature Communications)

A fundamental challenge in diagnostics is integrating multiple modalities to develop a joint characterization of physiological state. Using the heart as a model system, we develop a cross-modal autoencoder framework for integrating distinct data modalities and constructing a holistic representation of cardiovascular state. In particular, we use our framework to construct such cross-modal representations from cardiac magnetic resonance images (MRIs), containing structural information, and electrocardiograms (ECGs), containing myoelectric information. We leverage the learned cross-modal representation to (1) improve phenotype prediction from a single, accessible phenotype such as ECGs; (2) enable imputation of hard-to-acquire cardiac MRIs from easy-to-acquire ECGs; and (3) develop a framework for performing genome-wide association studies in an unsupervised manner. Our results systematically integrate distinct diagnostic modalities into a common representation that better characterizes physiologic state.
Overparameterized neural networks implement associative memory

https://doi.org/10.1073/PNAS.2005013117

Radhakrishnan, Adityanarayanan; Belkin, Mikhail; Uhler, Caroline (November 2020, Proceedings of the National Academy of Sciences)
Identifying computational mechanisms for memorization and retrieval of data is a long-standing problem at the intersection of machine learning and neuroscience. Our main finding is that standard overparameterized deep neural networks trained using standard optimization methods implement such a mechanism for real-valued data. We provide empirical evidence that 1) overparameterized autoencoders store training samples as attractors and thus iterating the learned map leads to sample recovery, and that 2) the same mechanism allows for encoding sequences of examples and serves as an even more efficient mechanism for memory than autoencoding. Theoretically, we prove that when trained on a single example, autoencoders store the example as an attractor. Lastly, by treating a sequence encoder as a composition of maps, we prove that sequence encoding provides a more efficient mechanism for memory than autoencoding.
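To make the attractor language concrete: a point x* is an attractor of a map f when f(x*) = x* and the Jacobian of f at x* has spectral radius below 1, so that iterating f from nearby points converges to x*. The sketch below uses a hand-built contraction as a stand-in for a trained autoencoder; the map `f`, the matrix `A`, and the stored point `x_star` are all hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
x_star = rng.normal(size=d)                   # the "stored" example (hypothetical)

# A = 0.5 * (orthogonal matrix), so its spectral norm is 0.5
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
A = 0.5 * Q

def f(x):
    """Stand-in for a learned map with fixed point x_star; Jacobian at x_star equals A."""
    return x_star + A @ np.tanh(x - x_star)

# Attractor condition: spectral radius of the Jacobian at the fixed point is below 1
spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))

# Retrieval by iteration: start from a perturbed point and repeatedly apply the map
x = x_star + rng.normal(size=d)
for _ in range(60):
    x = f(x)
```

Because tanh is 1-Lipschitz and the spectral norm of `A` is 0.5, each application of `f` at least halves the distance to `x_star`, which is why the iteration recovers the stored point.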
Full Text Available
Causal network models of SARS-CoV-2 expression and aging to identify candidates for drug repurposing

https://doi.org/10.1038/s41467-021-21056-z

Belyaeva, Anastasiya; Cammarata, Louis; Radhakrishnan, Adityanarayanan; Squires, Chandler; Yang, Karren Dai; Shivashankar, G. V.; Uhler, Caroline (February 2021, Nature Communications)

Given the severity of the SARS-CoV-2 pandemic, a major challenge is to rapidly repurpose existing approved drugs for clinical interventions. While a number of data-driven and experimental approaches have been suggested in the context of drug repurposing, a platform that systematically integrates available transcriptomic, proteomic and structural data is missing. More importantly, given that SARS-CoV-2 pathogenicity is highly age-dependent, it is critical to integrate aging signatures into drug discovery platforms. We here take advantage of large-scale transcriptional drug screens combined with RNA-seq data of the lung epithelium with SARS-CoV-2 infection as well as the aging lung. To identify robust druggable protein targets, we propose a principled causal framework that makes use of multiple data modalities. Our analysis highlights the importance of serine/threonine and tyrosine kinases as potential targets that intersect the SARS-CoV-2 and aging pathways. By integrating transcriptomic, proteomic and structural data that is available for many diseases, our drug discovery platform is broadly applicable. Rigorous in vitro experiments as well as clinical trials are needed to validate the identified candidate drugs.
Multi-domain translation between single-cell imaging and sequencing data using autoencoders

https://doi.org/10.1038/s41467-020-20249-2

Yang, Karren Dai; Belyaeva, Anastasiya; Venkatachalapathy, Saradha; Damodaran, Karthik; Katcoff, Abigail; Radhakrishnan, Adityanarayanan; Shivashankar, G. V.; Uhler, Caroline (January 2021, Nature Communications)

The development of single-cell methods for capturing different data modalities including imaging and sequencing has revolutionized our ability to identify heterogeneous cell states. Different data modalities provide different perspectives on a population of cells, and their integration is critical for studying cellular heterogeneity and its function. While various methods have been proposed to integrate different sequencing data modalities, coupling imaging and sequencing has been an open challenge. We here present an approach for integrating vastly different modalities by learning a probabilistic coupling between the different data modalities using autoencoders to map to a shared latent space. We validate this approach by integrating single-cell RNA-seq and chromatin images to identify distinct subpopulations of human naive CD4+ T-cells that are poised for activation. Collectively, our approach provides a framework to integrate and translate between data modalities that cannot yet be measured within the same cell for diverse applications in biomedical discovery.
Counting Markov equivalence classes for DAG models on trees

https://doi.org/10.1016/j.dam.2018.03.015

Radhakrishnan, Adityanarayanan; Solus, Liam; Uhler, Caroline (July 2018, Discrete Applied Mathematics)

Full Text Available
Machine Learning for Nuclear Mechano-Morphometric Biomarkers in Cancer Diagnosis

https://doi.org/10.1038/s41598-017-17858-1

Radhakrishnan, Adityanarayanan; Damodaran, Karthik; Soylemezoglu, Ali C.; Uhler, Caroline; Shivashankar, G. V. (December 2017, Scientific Reports)

Full Text Available
